## Lab 3 - Buffer Insertion

Name: Yi Yao ID: 011743488

#### **Problem**

- Satisfy the given timing constraint.
- Minimize the total buffer size (the sum of BUF\_X#).
- Submit
  - Final DEF file (see the next slide)
  - Final timing report (a screenshot or copy & paste)
  - Total buffer size
  - A brief description of the optimization methodology you used.

#### **Background**

- The design has a two-input NAND gate.
- $g\_out = \overline{g\_in[0] \cdot g\_in[1]}$ (NAND of the two inputs)
- Timing constraint: 500ps
- Layout width: 5,000um
  - The two input pins are on the left side of the layout.
  - The output pin is on the right side.
  - The NAND gate is on the left side.
  - Thus, the distance between the output of the NAND gate and the output pin `g_out` is almost 5,000um.
  - You are supposed to minimize the delay.
- Buffer Types: BUF\_X1, BUF\_X2, BUF\_X4, BUF\_X8, BUF\_X16, BUF\_X32
- Raw data (no buffers inserted): Arrival Time = 2.323 ns, Slack Time = -1.823 ns

```
Endpoint:   g_out (^)
Beginpoint: g_in[1] (v) triggered by leading edge of '@'
Path Groups: {default}
Analysis View: NG_view_typ
- External Delay                0.000
+ Path Delay                    0.500
= Required Time                 0.500
- Arrival Time                  2.323
= Slack Time                   -1.823
     Clock Rise Edge            0.000
     + Input Delay              0.000
     = Beginpoint Arrival Time  0.000

Pin      Edge  Net      Cell      Delay  Arrival  Required
                                         Time     Time
g_in[1]  v     g_in[1]                   0.000    -1.823
U1/A2    v     g_in[1]  NAND2_X1  0.000  0.000    -1.823
U1/ZN    ^     g_out    NAND2_X1  0.475  0.475    -1.348
g_out    ^     g_out    VBI       1.848  2.323     0.500
```
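As a quick sanity check on the report above, slack is the required time minus the arrival time, so a negative value means the constraint is violated. A minimal sketch in Python (the function name `slack` is only illustrative, and times are assumed to be in ns):

```python
def slack(required_time: float, arrival_time: float) -> float:
    """Return required time minus arrival time (negative = violation)."""
    return required_time - arrival_time


# Values read from the unbuffered timing report (assumed to be in ns).
unbuffered_slack = slack(0.500, 2.323)
print(round(unbuffered_slack, 3))  # -1.823
```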

## The Optimization Methodology

From the Background, three factors must be taken into account to optimize the circuit delay: the number of buffers (N), their location (L), and the buffer type (BUF\_X#).

#### 1. Delay minimization of a long wire



#### 2. Insert buffers

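The wire of total length $L$ is divided into $N$ segments of lengths $s_1, s_2, \dots, s_N$ ($\mu m$) by $N-1$ buffers. Here $R_{wire}$ and $C_{wire}$ are the resistance and capacitance of the whole wire, so segment $k$ carries the fraction $s_k / L$ of each. The delay of segment $k$ is

$$\tau_k = R_{input} \cdot \left(C_{wire} \cdot \frac{s_k}{L} + C_{in}\right) + R_{wire} \cdot \frac{s_k}{L} \cdot C_{in} + \frac{1}{2} \left(C_{wire} \cdot \frac{s_k}{L}\right) \left(R_{wire} \cdot \frac{s_k}{L}\right)$$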

$$\tau_{all} = R_{input} \cdot C_{wire} \cdot \frac{s_1 + \dots + s_N}{L} + N \cdot R_{input} \cdot C_{in} + R_{wire} \cdot C_{in} \cdot \frac{s_1 + \dots + s_N}{L} + \frac{R_{wire} \cdot C_{wire}}{2L^2} (s_1^2 + \dots + s_N^2)$$

$$= R_{input} \cdot (C_{wire} + N \cdot C_{in}) + R_{wire} \cdot C_{in} + \frac{R_{wire} \cdot C_{wire}}{2L^2} (s_1^2 + \dots + s_N^2)$$

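Minimize

$$T(s_1, \dots, s_N) = s_1^2 + \dots + s_N^2$$

subject to $s_1 + \dots + s_N = L$. Substituting $s_N = L - (s_1 + \dots + s_{N-1})$ and differentiating with respect to each $s_k$, $k = 1, \dots, N-1$:

$$\frac{\partial T}{\partial s_k} = 2 \cdot s_k + 2 \cdot s_N \cdot (-1) = 0 \Longrightarrow s_k = s_N$$

Therefore, $s_1 = s_2 = \cdots = s_N = L/N$, and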

$$\tau_{all} = R_{input} \cdot (C_{wire} + N \cdot C_{in}) + R_{wire} \cdot C_{in} + \frac{R_{wire} \cdot C_{wire}}{2N}$$

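That means the buffers should be spaced evenly along the wire. The derivation also assumes that all inserted buffers are of the same type, i.e. they share the same $R_{input}$ and $C_{in}$.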
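The equal-spacing result can be checked numerically. The sketch below sums the per-segment delay model over all segments and compares an even split of the wire with an uneven one. All parameter values are hypothetical placeholders, since the real $R_{input}$, $C_{in}$, $R_{wire}$, and $C_{wire}$ are not known here, and the function name `total_delay` is only illustrative.

```python
def total_delay(segments, r_in, c_in, r_wire, c_wire):
    """Sum the per-segment delay model over all segments.

    r_wire and c_wire are totals for the whole wire; each segment
    receives a share proportional to its length.
    """
    length = sum(segments)
    delay = 0.0
    for s in segments:
        seg_r = r_wire * s / length
        seg_c = c_wire * s / length
        delay += r_in * (seg_c + c_in) + seg_r * c_in + 0.5 * seg_r * seg_c
    return delay


# Hypothetical values, chosen only to illustrate the trend.
R_IN, C_IN, R_WIRE, C_WIRE = 1.0, 1.0, 100.0, 100.0

even = total_delay([1250.0] * 4, R_IN, C_IN, R_WIRE, C_WIRE)
uneven = total_delay([2000.0, 1000.0, 1000.0, 1000.0], R_IN, C_IN, R_WIRE, C_WIRE)

# Closed form for N equal segments, as in the last equation above.
n = 4
closed_form = R_IN * (C_WIRE + n * C_IN) + R_WIRE * C_IN + R_WIRE * C_WIRE / (2 * n)

print(even, uneven, closed_form)
```

With these placeholder values the even split matches the closed form, and the uneven split with the same total length is slower.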

$$\frac{\partial \tau_{all}}{\partial N} = R_{input} \cdot C_{in} - \frac{R_{wire} \cdot C_{wire}}{2N^2} = 0 \Longrightarrow N = \sqrt{\frac{R_{wire} \cdot C_{wire}}{2 \cdot R_{input} \cdot C_{in}}}$$

However, $R_{input}$, $C_{in}$, $R_{wire}$, and $C_{wire}$ are all unknown, so I cannot obtain the optimal $N$ by calculation.
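If the four parameters were available, the closed-form $N$ could be evaluated directly and cross-checked against a brute-force search over integer values. The sketch below does this under purely hypothetical parameter values; the function names are illustrative and not part of any tool.

```python
import math


def optimal_stage_count(r_in, c_in, r_wire, c_wire):
    """Closed-form N that zeroes the derivative of the total delay."""
    return math.sqrt(r_wire * c_wire / (2.0 * r_in * c_in))


def total_delay_equal(n, r_in, c_in, r_wire, c_wire):
    """Total delay for n equal segments (closed form)."""
    return r_in * (c_wire + n * c_in) + r_wire * c_in + r_wire * c_wire / (2.0 * n)


# Hypothetical values; the real ones are unknown in this lab.
params = dict(r_in=1.0, c_in=1.0, r_wire=100.0, c_wire=200.0)

n_star = optimal_stage_count(**params)
best_n = min(range(1, 501), key=lambda n: total_delay_equal(n, **params))

print(n_star, best_n)  # 100.0 100
```

In practice $N$ must be an integer, so the closed-form value would be rounded and the neighbouring integers compared.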

#### 3. Buffers' Parameters

Intuitively, if I insert each type of buffer into the wire, I can measure the electrical properties of each buffer type. I can then choose the type with the best electrical properties, insert it into the wire, and vary the number of buffers to minimize the wire delay.



Figure 1: Each Buffer's Electrical Properties

As Figure 1 shows, the buffers with the best electrical properties are BUF\_X4 and BUF\_X8.



Figure 2: Each Buffer's Delay (when the total number of inserted buffers is 9)

Figure 2 shows that when I evenly insert 9 BUF\_X1 buffers into the wire, the total delay is 1.289. The total delays for the other buffer types are also shown in Figure 2. These results confirm that the buffers with the best electrical properties are BUF\_X4 and BUF\_X8.

#### 4. Minimize the Total Delay



Figure 3: Minimize The Total Delay of BUF\_X4 and BUF\_X8

Based on the conclusion above, we can obtain the minimum total delay shown in Figure 3. The minimum total delay is 0.662, reached when 10 BUF\_X8 buffers are inserted.

However, after analyzing the timing reports (Figure 4 and Figure 5), I found two problems:

- (1) The delay of the head of the wire and the first inserted buffer is much higher than that of the remaining buffers (**insufficient optimization**);
- (2) The delay of the tail of the wire is much lower than that of the remaining buffers (**over-optimization**).

So I try to reduce the delay at the head of the wire while increasing the delay at the tail. The question is by how much each delay should be reduced or increased.

The reports show that, apart from the first buffer, the sum of the out/A and out/Z delays is essentially the same for every buffer. Therefore, I reduce the delay at the head of the wire until it equals the average sum of out/A and out/Z, and increase the delay at the tail of the wire until it equals the same average. This means shifting the entire buffer chain some distance toward the head of the wire.

For instance, suppose the number of inserted buffers is 11. Each segment length is then $5000/12 = 416.66\cdots$ um. From Figure 5, the sum of out/A and out/Z is 0.051 and the delay of the input signal (U1/ZN) is 0.068. So the first buffer location is $(0.051/0.068) \times (5000/12) = 312.5$ um.

The first buffer is therefore placed 312.5 um from the signal input, and the remaining segments are still $5000/12$ um long.
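The placement rule in this example can be written as a short calculation. The sketch below is only an illustration of the arithmetic above: the function name `shifted_buffer_positions` and its parameters are hypothetical, and positions are measured in um from the signal input.

```python
def shifted_buffer_positions(wire_length, n_buffers, stage_delay, head_delay):
    """Return buffer positions after shifting the chain toward the head.

    The first segment is scaled by stage_delay / head_delay; every later
    segment keeps the even length wire_length / (n_buffers + 1).
    """
    segment = wire_length / (n_buffers + 1)
    first = (stage_delay / head_delay) * segment
    return [first + k * segment for k in range(n_buffers)]


positions = shifted_buffer_positions(5000.0, 11, 0.051, 0.068)
print(round(positions[0], 1))   # 312.5
print(round(positions[-1], 1))  # 4479.2
```

The tail segment after the last buffer becomes longer than the others, which is the intended effect of raising the tail delay.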

| Report field              | Value                                        |
|---------------------------|----------------------------------------------|
| Path 1                    | VIOLATED Path Delay Check                    |
| Endpoint                  | g_out (v)                                    |
| Beginpoint                | g_in[1] (^) triggered by leading edge of '@' |
| Path Groups               | {default}                                    |
| Analysis View             | NG_view_typ                                  |
| - External Delay          | 0.000                                        |
| + Path Delay              | 0.500                                        |
| = Required Time           | 0.500                                        |
| - Arrival Time            | 0.662                                        |
| = Slack Time              | -0.162                                       |
| Clock Rise Edge           | 0.000                                        |
| + Input Delay             | 0.000                                        |
| = Beginpoint Arrival Time | 0.000                                        |

| Pin              | Edge | Net            | Cell     | Delay | Arrival Time | Required Time |
|------------------|------|----------------|----------|-------|--------------|---------------|
| g_in[1]          | ^    | g_in[1]        |          |       | 0.000        | -0.162        |
| U1/A2            | ^    | g_in[1]        | NAND2_X1 | 0.000 | 0.000        | -0.162        |
| U1/ZN            | v    | FE_ECON0_g_out | NAND2_X1 | 0.072 | 0.072        | -0.090        |
| FE_ECOC0_g_out/A | v    | FE_ECON0_g_out | BUF_X8   | 0.027 | 0.099        | -0.063        |
| FE_ECOC0_g_out/Z | v    | FE_ECON1_g_out | BUF_X8   | 0.053 | 0.152        | -0.010        |
| FE_ECOC1_g_out/A | v    | FE_ECON1_g_out | BUF_X8   | 0.021 | 0.173        | 0.011         |
| FE_ECOC1_g_out/Z | v    | FE_ECON2_g_out | BUF_X8   | 0.035 | 0.208        | 0.046         |
| FE_ECOC2_g_out/A | v    | FE_ECON2_g_out | BUF_X8   | 0.020 | 0.229        | 0.066         |
| FE_ECOC2_g_out/Z | v    | FE_ECON3_g_out | BUF_X8   | 0.034 | 0.263        | 0.101         |
| FE_ECOC3_g_out/A | v    | FE_ECON3_g_out | BUF_X8   | 0.020 | 0.283        | 0.121         |
| FE_ECOC3_g_out/Z | v    | FE_ECON4_g_out | BUF_X8   | 0.034 | 0.317        | 0.155         |
| FE_ECOC4_g_out/A | v    | FE_ECON4_g_out | BUF_X8   | 0.021 | 0.339        | 0.176         |
| FE_ECOC4_g_out/Z | v    | FE_ECON5_g_out | BUF_X8   | 0.035 | 0.373        | 0.211         |
| FE_ECOC5_g_out/A | v    | FE_ECON5_g_out | BUF_X8   | 0.020 | 0.394        | 0.231         |
| FE_ECOC5_g_out/Z | v    | FE_ECON6_g_out | BUF_X8   | 0.034 | 0.428        | 0.266         |
| FE_ECOC6_g_out/A | v    | FE_ECON6_g_out | BUF_X8   |       | 0.448        | 0.286         |
| FE_ECOC6_g_out/Z | v    | FE_ECON7_g_out | BUF_X8   | 0.034 | 0.483        | 0.321         |
| FE_ECOC7_g_out/A | v    | FE_ECON7_g_out | BUF_X8   | 0.020 |              |               |
| FE_ECOC7_g_out/Z | v    | FE_ECON8_g_out | BUF_X8   | 0.034 | 0.538        | 0.375         |
| FE_ECOC8_g_out/A | v    | FE_ECON8_g_out | BUF_X8   |       | 0.558        | 0.396         |
| FE_ECOC8_g_out/Z | v    | FE_ECON9_g_out | BUF_X8   | 0.034 | 0.592        | 0.430         |
| FE_ECOC9_g_out/A | v    | FE_ECON9_g_out | BUF_X8   | 0.020 | 0.613        | 0.451         |
| FE_ECOC9_g_out/Z | v    | g_out          | BUF_X8   | 0.034 | 0.647        | 0.485         |
| g_out            | v    | g_out          | VBI      | 0.015 | 0.662        | 0.500         |

Figure 4: Analyze The Result Of Inserting 10 BUF\_X8

| Report field              | Value       |
|---------------------------|-------------|
| Analysis View             | NG_view_typ |
| - External Delay          | 0.000       |
| + Path Delay              | 0.500       |
| = Required Time           | 0.500       |
| - Arrival Time            | 0.662       |
| = Slack Time              | -0.162      |
| Clock Rise Edge           | 0.000       |
| + Input Delay             | 0.000       |
| = Beginpoint Arrival Time | 0.000       |

| Pin               | Edge | Net             | Cell     | Delay | Arrival Time | Required Time |
|-------------------|------|-----------------|----------|-------|--------------|---------------|
| g_in[1]           | ^    | g_in[1]         |          |       | 0.000        | -0.16         |
| U1/A2             | ^    | g_in[1]         | NAND2_X1 | 0.000 | 0.000        | -0.16         |
| U1/ZN             | v    | FE_ECON0_g_out  | NAND2_X1 | 0.068 | 0.068        | -0.09         |
| FE_ECOC0_g_out/A  | v    | FE_ECON0_g_out  | BUF_X8   | 0.023 | 0.091        | -0.07         |
| FE_ECOC0_g_out/Z  | v    | FE_ECON1_g_out  | BUF_X8   | 0.051 | 0.142        | -0.02         |
| FE_ECOC1_g_out/A  | v    | FE_ECON1_g_out  | BUF_X8   | 0.019 | 0.160        | -0.00         |
| FE_ECOC1_g_out/Z  | v    | FE_ECON2_g_out  | BUF_X8   | 0.033 | 0.194        | 0.03          |
| FE_ECOC2_g_out/A  | v    | FE_ECON2_g_out  | BUF_X8   | 0.018 | 0.212        | 0.05          |
| FE_ECOC2_g_out/Z  | v    | FE_ECON3_g_out  | BUF_X8   | 0.033 | 0.244        | 0.08          |
| FE_ECOC3_g_out/A  | v    | FE_ECON3_g_out  | BUF_X8   | 0.018 | 0.262        | 0.10          |
| FE_ECOC3_g_out/Z  | v    | FE_ECON4_g_out  | BUF_X8   | 0.033 | 0.295        | 0.13          |
| FE_ECOC4_g_out/A  | v    | FE_ECON4_g_out  | BUF_X8   | 0.018 | 0.313        | 0.15          |
| FE_ECOC4_g_out/Z  | v    | FE_ECON5_g_out  | BUF_X8   | 0.033 | 0.346        | 0.18          |
| FE_ECOC5_g_out/A  | v    | FE_ECON5_g_out  | BUF_X8   | 0.018 | 0.363        | 0.20          |
| FE_ECOC5_g_out/Z  | v    | FE_ECON6_g_out  | BUF_X8   | 0.033 | 0.396        | 0.23          |
| FE_ECOC6_g_out/A  | v    | FE_ECON6_g_out  | BUF_X8   | 0.018 | 0.414        | 0.25          |
| FE_ECOC6_g_out/Z  | v    | FE_ECON7_g_out  | BUF_X8   | 0.033 | 0.447        | 0.28          |
| FE_ECOC7_g_out/A  | v    | FE_ECON7_g_out  | BUF_X8   | 0.018 | 0.465        | 0.30          |
| FE_ECOC7_g_out/Z  | v    | FE_ECON8_g_out  | BUF_X8   | 0.033 | 0.497        | 0.33          |
| FE_ECOC8_g_out/A  | v    | FE_ECON8_g_out  | BUF_X8   | 0.018 | 0.515        | 0.35          |
| FE_ECOC8_g_out/Z  | v    | FE_ECON9_g_out  | BUF_X8   | 0.033 | 0.548        | 0.38          |
| FE_ECOC9_g_out/A  | v    | FE_ECON9_g_out  | BUF_X8   | 0.018 | 0.566        | 0.40          |
| FE_ECOC9_g_out/Z  | v    | FE_ECON10_g_out | BUF_X8   | 0.033 | 0.599        | 0.43          |
| FE_ECOC10_g_out/A | v    | FE_ECON10_g_out | BUF_X8   | 0.018 | 0.616        | 0.45          |
| FE_ECOC10_g_out/Z | v    | g_out           | BUF_X8   | 0.033 | 0.649        | 0.48          |
| g_out             | v    | g_out           | VBI      |       | 0.662        | 0.50          |

Figure 5: Analyze The Result Of Inserting 11 BUF\_X8
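The per-buffer pattern in Figure 5 can be made explicit by summing the out/A and out/Z delays for each inserted buffer. The sketch below copies the delay values from the Figure 5 report; the variable names are illustrative only.

```python
from statistics import median

# (out/A delay, out/Z delay) for each of the 11 buffers in Figure 5.
stages = [(0.023, 0.051), (0.019, 0.033)] + [(0.018, 0.033)] * 9
head_wire_delay = 0.068  # U1/ZN delay in Figure 5

stage_sums = [round(a + z, 3) for a, z in stages]
typical = median(stage_sums)

print(stage_sums[0])                        # 0.074 (first buffer)
print(typical)                              # 0.051 (typical buffer)
print(round(typical / head_wire_delay, 2))  # 0.75
```

The first stage stands out from the rest, and the ratio of the typical stage delay to the head delay is the 0.051/0.068 scaling factor applied to the first segment.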

After that, I tried removing a buffer from the end of the buffer chain. This gave the minimized total delays shown in Figures 6 and 7.



Figure 6: The Result Of New Way To Minimize 11 BUF\_X8 Delay



Figure 7: The Result Of New Way To Minimize The Total Delay

Clearly, the minimum total delay of 0.634 is obtained when the number of BUF\_X8 buffers is 10.
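The problem statement defines the total buffer size as the sum of the BUF\_X# drive strengths. A minimal sketch of that sum, assuming the final design contains exactly the ten BUF\_X8 cells reported above (the helper name `total_buffer_size` is hypothetical):

```python
import re


def total_buffer_size(cells):
    """Sum the X# drive strength over a list of BUF_X# cell names."""
    total = 0
    for name in cells:
        match = re.fullmatch(r"BUF_X(\d+)", name)
        if match is None:
            raise ValueError(f"not a BUF_X# cell: {name}")
        total += int(match.group(1))
    return total


final_cells = ["BUF_X8"] * 10  # assumed final buffer list
print(total_buffer_size(final_cells))  # 80
```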